# Set the console width separately; it is an R option, not a knitr chunk option
options(width = 150)
knitr::opts_chunk$set(
  #collapse = TRUE,
  comment = "",
  fig.width = 4,
  fig.height = 4,
  message = FALSE,
  warning = FALSE,
  tidy.opts = list(
    keep.blank.line = TRUE,
    width.cutoff = 150
  ),
  eval = TRUE
)

Introduction

In Vignette 6 we created a data frame called protPepNSA_AT5tmtMS2 that contains all protein profiles, with each protein profile followed by its component peptide profiles. In this vignette we first calculate RSA-transformed profiles for all proteins and peptides, and then compute the constrained proportional assignments (CPA) for all proteins and peptides in a form ready for export. Finally, we show how to use these results to plot the profiles of any protein and its component peptides, with outlier peptides labelled in the plot.

RSA transformations

First, we attach the protlocassign and protsummarize packages, the latter of which includes the protPepNSA_AT5tmtMS2 data frame. Note that for rows containing proteins, the peptide column contains just the protein name while for rows containing peptides, the peptide column contains a combined protein and peptide name. As in previous vignettes, for ease of presentation, we rename the embedded data frames to remove experiment specific designations (e.g., AT5tmtMS2).

library(protlocassign)
library(protsummarize)

protPepNSA <- protPepNSA_AT5tmtMS2
str(protPepNSA)
totProt <- totProtAT5
totProt

Next we extract the NSA reference profiles from the nine profile columns of protPepNSA:

refLocationProfilesNSA <- locationProfileSetup(profile=protPepNSA[, 4 + (1:9)],
                          markerList=markerListJadot, numDataCols=9)
round(refLocationProfilesNSA, digits=4)
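As a rough illustration of what a reference profile represents, the sketch below averages the profiles of marker proteins within each compartment using base R only. It assumes that a compartment's reference profile is the mean of its markers' profiles. The toy data, the column names F1 and F2, and the objects toyProfiles, toyMarkers and toyRef are hypothetical, and this is not the internal code of locationProfileSetup.

```r
# Hypothetical toy profiles: 4 proteins measured in 2 fractions
toyProfiles <- data.frame(
  prot = c("P1", "P2", "P3", "P4"),
  F1 = c(0.8, 0.6, 0.1, 0.3),
  F2 = c(0.2, 0.4, 0.9, 0.7),
  stringsAsFactors = FALSE
)

# Hypothetical marker list: each marker protein is tied to one compartment
toyMarkers <- data.frame(
  protName = c("P1", "P2", "P3", "P4"),
  compartment = c("Cyto", "Cyto", "Mito", "Mito"),
  stringsAsFactors = FALSE
)

# Attach the compartment label to each marker's profile
merged <- merge(toyMarkers, toyProfiles, by.x = "protName", by.y = "prot")

# Assumption: a reference profile is the per-compartment mean of marker profiles
toyRef <- aggregate(cbind(F1, F2) ~ compartment, data = merged, FUN = mean)
toyRef
```

Each row of toyRef then plays the role of one reference profile, with one column per fraction.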

Using the RSAfromNSA function described previously in Vignette 3, we calculate the RSA-transformed marker profiles:

refLocationProfilesRSA <- RSAfromNSA(NSA=refLocationProfilesNSA,
                              NstartMaterialFractions=6, totProt=totProtAT5)
round(refLocationProfilesRSA, digits=4)

We transform the protein and peptide profiles in the same way: we pass the nine profile columns of protPepNSA to RSAfromNSA, which yields an intermediate nine-column data frame, protPepRSA_trimmed, of RSA-transformed profiles.

protPepRSA_trimmed <- RSAfromNSA(NSA=protPepNSA[, 4 + (1:9)],
                              NstartMaterialFractions=6, totProt=totProtAT5)
str(protPepRSA_trimmed)

Finally, we add back the four reference columns as the first columns of protPepRSA, and append the two columns listing the number of spectra and peptides per protein. The resulting data frame protPepRSA has the same structure as the original data frame protPepNSA.

protPepRSA <- data.frame(protPepNSA[, 1:4], protPepRSA_trimmed, protPepNSA[,14:15] )  # add in the ref columns
str(protPepRSA)
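The column bookkeeping used here can be checked on a small stand-in data frame. The sketch below is only an illustration: toyNSA, its column names, and the doubling used as a placeholder transformation are all hypothetical. It shows that `4 + (1:9)` and `4+1:9` select the same columns (5 through 13), because `:` binds more tightly than `+` in R, and that reassembling the pieces preserves the original column layout.

```r
# Hypothetical stand-in with 4 reference columns, 9 profile columns,
# and 2 trailing count columns (15 columns in total)
toyNSA <- data.frame(
  prot = c("TLN1", "TLN1"),
  peptide = c("TLN1", "TLN1_pepA"),
  ref3 = c("a", "b"),
  ref4 = c("c", "d"),
  matrix(c(0.1, 0.2), nrow = 2, ncol = 9,
         dimnames = list(NULL, paste0("F", 1:9))),
  Nspectra = c(12, 3),
  Npep = c(4, 1),
  stringsAsFactors = FALSE
)

# ":" has higher precedence than "+", so both forms give columns 5..13
profileCols <- 4 + (1:9)
identical(profileCols, 4 + 1:9)
all(profileCols == 5:13)

# Placeholder transformation of the profile columns only (not RSAfromNSA)
toyTrimmed <- toyNSA[, profileCols] * 2

# Reassemble: reference columns, transformed profiles, count columns
toyRSA <- data.frame(toyNSA[, 1:4], toyTrimmed, toyNSA[, 14:15])
identical(names(toyRSA), names(toyNSA))
```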

Plotting RSA protein and peptide profiles

Next, we identify rows with proteins only, and extract them. The resulting data frame, protRSA, parallels the structure of protNSA. We also extract the rows with peptides only in the data frame pepRSA.

protRSA.ind <- (protPepRSA$prot == protPepRSA$peptide)  # TRUE for protein rows
protRSA <- protPepRSA[protRSA.ind,]  # these are the data for proteins only
dim(protRSA)
pepRSA <- protPepRSA[!protRSA.ind,] # these are the data for peptides only
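The same splitting logic can be tried on a few made-up rows. In this sketch the data frame toyPP and its peptide labels are hypothetical; the only convention taken from the text is that a protein row has identical entries in the prot and peptide columns.

```r
# Hypothetical combined protein/peptide table
toyPP <- data.frame(
  prot = c("TLN1", "TLN1", "TLN1", "ACTB", "ACTB"),
  peptide = c("TLN1", "TLN1_pepA", "TLN1_pepB", "ACTB", "ACTB_pepA"),
  stringsAsFactors = FALSE
)

# A row describes a protein when the two name columns match
isProtein <- toyPP$prot == toyPP$peptide

toyProt <- toyPP[isProtein, ]   # protein rows only
toyPep <- toyPP[!isProtein, ]   # peptide rows only

# Every row ends up in exactly one of the two subsets
nrow(toyProt) + nrow(toyPep) == nrow(toyPP)

# Number of peptide rows per protein
table(toyPep$prot)
```

Keeping the name columns as character vectors (rather than factors) makes the equality comparison safe across R versions.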

We now compute the constrained proportional assignments (CPA) for proteins only, using the RSA-transformed profiles.

data.frame(colnames(protRSA))
protCPAfromRSA <-  fitCPA(profile=protRSA[, 4+1:9],
                      refLocationProfiles=refLocationProfilesRSA, 
                      numDataCols=9)
str(protCPAfromRSA, strict.width="cut", width=65)
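To make the idea of a constrained proportional assignment concrete, here is a minimal base R sketch under stated assumptions. It treats an observed profile as a mixture of reference profiles and finds non-negative proportions that sum to one by minimizing the squared error with optim. The toy reference profiles, the helper functions toProps and sse, and the choice of a softmax-style reparameterization are all illustrative assumptions; this is not the algorithm implemented in fitCPA.

```r
# Hypothetical reference profiles: 3 compartments (rows) x 4 fractions (columns)
refProfiles <- rbind(
  compA = c(0.70, 0.20, 0.05, 0.05),
  compB = c(0.10, 0.60, 0.20, 0.10),
  compC = c(0.05, 0.05, 0.30, 0.60)
)

# Synthetic "protein" profile built as a known mixture of the references
trueProps <- c(compA = 0.6, compB = 0.3, compC = 0.1)
obsProfile <- as.vector(trueProps %*% refProfiles)

# Map K-1 unconstrained numbers to K proportions that are >= 0 and sum to 1
toProps <- function(theta) {
  z <- exp(c(0, theta))
  z / sum(z)
}

# Sum of squared differences between the observed and the mixed profile
sse <- function(theta) {
  fitted <- as.vector(toProps(theta) %*% refProfiles)
  sum((obsProfile - fitted)^2)
}

# Unconstrained optimization over theta; the constraints hold by construction
fit <- optim(par = c(0, 0), fn = sse, method = "BFGS")
estProps <- setNames(toProps(fit$par), rownames(refProfiles))
round(estProps, 3)
```

Because the synthetic profile is an exact mixture, the estimated proportions should be close to trueProps. With real data the fit is approximate, and a proportion that is truly zero can only be approached, not reached, under this particular reparameterization.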

The following plot shows the protein and peptide profiles for TLN1, together with the CPA estimates. Outlier peptide profiles are shown in orange. The header reports the number of peptides and spectra used to compute the protein profile, which in this case excludes outlier peptides and outlier spectra.

#windows(width=7.5, height=10)  # open a window 7.5 by 10 inches
protPepPlotfun(protName="TLN1", protProfile=protRSA[,5:15],
               Nspectra=TRUE, pepProfile=pepRSA, numRefCols=4,
               numDataCols=9, n.compartments=8, 
               refLocationProfiles=refLocationProfilesRSA,
               assignPropsMat=protCPAfromRSA, 
               yAxisLabel="Relative Specific Amount")

Note that the outlier peptides do not contribute to the CPA analysis of the proteins, but they may still be of interest. For instance, they may represent protein isoforms with distinct distributions. Thus, it may be useful to calculate the CPA estimates for all proteins and peptides, which can be done with the following command. Since this may take over 15 minutes on a desktop, we don’t execute it here:

protPepCPAfromRSA <- fitCPA(profile=protPepRSA[,4 + 1:9],
                               refLocationProfiles=refLocationProfilesRSA, numDataCols=9)
str(protPepCPAfromRSA)

We next assemble the final CPA values for the protein/peptide data along with ancillary information, ready for export. Then we output the data to C:\temp\myProteinOutput; users will of course select their own directory.

protPepCPAfromRSAout <- data.frame(protPepRSA[,1:4], protPepCPAfromRSA, protPepRSA[,14:15])
protPepCPAfromRSAout$prot <- paste("`", protPepCPAfromRSAout$prot, sep="")
protPepCPAfromRSAout$peptide <- paste("`", protPepCPAfromRSAout$peptide, sep="")

setwd("C:\\temp\\myProteinOutput")
write.csv(protPepCPAfromRSAout, file="protPepCPAfromRSAout.csv", row.names=FALSE, na=".")
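The export conventions used above (a backtick prefix on the name columns and "." for missing values) have to be undone when the file is read back into R. The sketch below shows a round trip on a hypothetical three-column table; toyOut, the Cyto column, and the temporary file name are illustrative only, and tempdir() is used so that nothing is written to a fixed directory.

```r
# Hypothetical miniature of the export table, with one missing value
toyOut <- data.frame(
  prot = c("TLN1", "TLN1"),
  peptide = c("TLN1", "TLN1_pepA"),
  Cyto = c(0.9, NA),
  stringsAsFactors = FALSE
)

# Same conventions as above: backtick prefix on the name columns
toyOut$prot <- paste("`", toyOut$prot, sep = "")
toyOut$peptide <- paste("`", toyOut$peptide, sep = "")

# Write to a temporary location; missing values are written as "."
outFile <- file.path(tempdir(), "toyCPAout.csv")
write.csv(toyOut, file = outFile, row.names = FALSE, na = ".")

# Read back: treat "." as NA and strip the leading backtick again
readBack <- read.csv(outFile, na.strings = ".", stringsAsFactors = FALSE)
readBack$prot <- sub("^`", "", readBack$prot)
readBack$peptide <- sub("^`", "", readBack$peptide)
readBack
```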

To output plots of all of the protein and peptide profiles into a single PDF file, we first use setwd to point to the desired output directory, and then set up a loop as follows:

setwd("C:\\temp\\myProteinOutput")
pdf(file="allProtPepPlotsRSA.pdf", width=7, height=10)
n.prots <- nrow(protRSA)
for (i in seq_len(n.prots)) {
   protPepPlotfun(protName=protRSA$prot[i],
       protProfile=protRSA[,5:15],
       Nspectra=TRUE, pepProfile=pepRSA, numRefCols=4,
       numDataCols=9, n.compartments=8, 
       refLocationProfiles=refLocationProfilesRSA,
       assignPropsMat=protCPAfromRSA, 
       yAxisLabel="Relative Specific Amount")
}
dev.off()
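When a loop like this stops with an error partway through, the PDF device stays open and the file may be left unreadable. One possible safeguard, sketched here with base graphics and hypothetical toy profiles rather than protPepPlotfun, is to wrap the loop in tryCatch so that dev.off() always runs. The object toyProfiles and the output file name are assumptions made for the example.

```r
# Hypothetical profiles: one row per protein, one column per fraction
toyProfiles <- rbind(
  protA = c(0.1, 0.3, 0.6),
  protB = c(0.5, 0.4, 0.1)
)

# Write to a temporary location instead of changing the working directory
outFile <- file.path(tempdir(), "toyProfilePlots.pdf")
pdf(file = outFile, width = 7, height = 10)

tryCatch({
  # Each call to plot() starts a new page in the PDF
  for (i in seq_len(nrow(toyProfiles))) {
    plot(toyProfiles[i, ], type = "b", ylim = c(0, 1),
         xlab = "Fraction", ylab = "Relative amount",
         main = rownames(toyProfiles)[i])
  }
}, finally = dev.off())  # close the device even if a plot fails

file.exists(outFile)
```

Using seq_len() instead of 1:n also means the loop body is skipped cleanly when there are no rows to plot.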

References

Jadot M, Boonen M, Thirion J, Wang N, Xing J, Zhao C, Tannous A, Qian M, Zheng H, Everett JK, Moore DF, Sleat DE, Lobel P (2016) Accounting for protein subcellular localization: a compartmental map of the rat liver proteome. Molecular and Cellular Proteomics 16, 194-212. doi:10.1074/mcp.M116.064527 PMCID: PMC5294208

Tannous A, Boonen M, Zheng H, Zhao C, Germain C, Moore D, Sleat D, Jadot M, Lobel P (2020) Comparative Analysis of Quantitative Mass Spectrometric Methods for Subcellular Proteomics. Journal of Proteome Research 19, 1718-1730. doi:10.1021/acs.jproteome.9b00862



mooredf22/protsummarize2 documentation built on May 16, 2021, 10:12 p.m.